All code is given in the appendix, under the corresponding task, together with explanatory comments.

Task 1-2

Part a-b-c

In this part, for each of 5 bookmakers (‘Tipico’, ‘ComeOn’, ‘Betsafe’, ‘Unibet’, ‘SBOBET’) separately, we take the following odds as features and apply PCA: home win, away win, tie, over 2.5, under 2.5, both teams to score (YES and NO), over 0.5 and under 0.5. Note that not every bookmaker has given odds for all of these bet types. Only the final odds are taken into consideration.

Let us start with bookmaker Unibet. We see that the first two components capture about 95% of the total variance. The away win odd and the under 0.5 odd have the largest absolute loadings on the first component (0.671 and 0.699), so they are the features that contribute most to the direction of greatest variability.

## Importance of components:
##                           Comp.1    Comp.2     Comp.3      Comp.4
## Standard deviation     5.6237193 3.2417521 1.49671891 0.291045932
## Proportion of Variance 0.7104169 0.2360617 0.05032068 0.001902782
## Cumulative Proportion  0.7104169 0.9464786 0.99679927 0.998702056
##                              Comp.5       Comp.6       Comp.7       Comp.8
## Standard deviation     0.1931696533 0.1131238332 0.0715494481 4.983352e-02
## Proportion of Variance 0.0008381925 0.0002874579 0.0001149949 5.578394e-05
## Cumulative Proportion  0.9995402483 0.9998277062 0.9999427011 9.999985e-01
##                              Comp.9
## Standard deviation     8.212430e-03
## Proportion of Variance 1.514988e-06
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## NO_Unibet_NA                           0.363  0.406  0.538         0.637
## YES_Unibet_NA                         -0.346 -0.340 -0.405         0.768
## odd1_Unibet_NA    0.109  0.576 -0.748 -0.196  0.233                     
## odd2_Unibet_NA   -0.671 -0.550 -0.398 -0.174  0.234                     
## oddX_Unibet_NA   -0.213        -0.357  0.695 -0.583                     
## over_Unibet_0.5                                                         
## over_Unibet_2.5                       -0.351 -0.458  0.655 -0.472       
## under_Unibet_0.5 -0.699  0.598  0.373 -0.105                            
## under_Unibet_2.5                       0.249  0.242 -0.329 -0.873       
##                  Comp.9
## NO_Unibet_NA           
## YES_Unibet_NA          
## odd1_Unibet_NA         
## odd2_Unibet_NA         
## oddX_Unibet_NA         
## over_Unibet_0.5  -0.994
## over_Unibet_2.5        
## under_Unibet_0.5       
## under_Unibet_2.5       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000
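
The proportions in the summary above follow directly from the component standard deviations: each component's share is its variance divided by the total variance. A minimal check, using the Unibet standard deviations copied from the output (the variable names are illustrative and not part of the original code):

```r
# Standard deviations of the nine Unibet components, copied from the summary
sdev <- c(5.6237193, 3.2417521, 1.49671891, 0.291045932, 0.1931696533,
          0.1131238332, 0.0715494481, 4.983352e-02, 8.212430e-03)

# Share of the total variance per component, and the running total
prop_var <- sdev^2 / sum(sdev^2)
cum_var  <- cumsum(prop_var)

# These should reproduce the "Proportion of Variance" and
# "Cumulative Proportion" rows of the summary
print(round(prop_var[1:3], 4))
print(round(cum_var[1:3], 4))
```

The same two lines apply to the other bookmakers by swapping in their standard deviations.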

This is the summary of the PCA for bookmaker Tipico. The first two components capture 99% of the total variance, which is quite good. The away win odd again has the largest absolute loading on the first component (0.930). Note that Tipico did not give any odds for over/under 0.5.

## Importance of components:
##                           Comp.1    Comp.2      Comp.3      Comp.4
## Standard deviation     4.0468116 1.6285337 0.372282760 0.205711464
## Proportion of Variance 0.8519308 0.1379659 0.007209816 0.002201382
## Cumulative Proportion  0.8519308 0.9898967 0.997106493 0.999307875
##                              Comp.5       Comp.6       Comp.7
## Standard deviation     0.0916297713 0.0633771695 2.986738e-02
## Proportion of Variance 0.0004367687 0.0002089508 4.640585e-05
## Cumulative Proportion  0.9997446434 0.9999535942 1.000000e+00
## 
## Loadings:
##                  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7
## NO_Tipico_NA                    0.352  0.283  0.634  0.188  0.598
## YES_Tipico_NA                  -0.286 -0.205 -0.480         0.800
## odd1_Tipico_NA    0.282 -0.892 -0.258  0.236                     
## odd2_Tipico_NA   -0.930 -0.164 -0.220  0.237                     
## oddX_Tipico_NA   -0.227 -0.404  0.479 -0.739                     
## over_Tipico_2.5                -0.432 -0.366  0.210  0.792       
## under_Tipico_2.5                0.513  0.293 -0.559  0.572       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.143  0.143  0.143  0.143  0.143  0.143  0.143
## Cumulative Var  0.143  0.286  0.429  0.571  0.714  0.857  1.000

This is the summary of the PCA for bookmaker ComeOn. This time, PC1 captures less of the variance (69%) than it did for the previous two bookmakers. The away win odd again has the largest absolute loading on PC1 (0.909).

## Importance of components:
##                           Comp.1    Comp.2     Comp.3      Comp.4
## Standard deviation     4.2077540 2.5021790 1.26696487 0.255015557
## Proportion of Variance 0.6895588 0.2438414 0.06251724 0.002532818
## Cumulative Proportion  0.6895588 0.9334002 0.99591743 0.998450250
##                              Comp.5       Comp.6      Comp.7       Comp.8
## Standard deviation     0.1591237336 0.0940752447 0.057532919 4.781648e-02
## Proportion of Variance 0.0009861445 0.0003446842 0.000128915 8.904834e-05
## Cumulative Proportion  0.9994363940 0.9997810782 0.999909993 9.999990e-01
##                              Comp.9
## Standard deviation     4.960737e-03
## Proportion of Variance 9.584360e-07
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## NO_ComeOn_NA                           0.251  0.449  0.457         0.718
## YES_ComeOn_NA                         -0.262 -0.391 -0.540         0.691
## odd1_ComeOn_NA    0.323  0.725 -0.540 -0.216  0.174                     
## odd2_ComeOn_NA   -0.909        -0.313 -0.201  0.172                     
## oddX_ComeOn_NA   -0.180  0.298 -0.173  0.767 -0.505                     
## over_ComeOn_0.5                                                         
## over_ComeOn_2.5                       -0.300 -0.457  0.545  0.617       
## under_ComeOn_0.5 -0.188  0.607  0.745 -0.177                            
## under_ComeOn_2.5                       0.267  0.334 -0.438  0.779       
##                  Comp.9
## NO_ComeOn_NA           
## YES_ComeOn_NA          
## odd1_ComeOn_NA         
## odd2_ComeOn_NA         
## oddX_ComeOn_NA         
## over_ComeOn_0.5   0.993
## over_ComeOn_2.5  -0.111
## under_ComeOn_0.5       
## under_ComeOn_2.5       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000

This is the summary of the PCA for bookmaker Betsafe. Note that Betsafe has given odds for every bet type. The under 0.5 odd and the away win odd have the largest absolute loadings on PC1 (0.728 and 0.647). PC1 captures 65% of the variability and PC2 captures 29%. Compared with Tipico, which has fewer features, the cumulative proportion of the first two components is lower; a decrease as the number of features increases is expected.

## Importance of components:
##                           Comp.1    Comp.2     Comp.3      Comp.4
## Standard deviation     4.0139002 2.6703220 1.16365809 0.250254042
## Proportion of Variance 0.6523032 0.2886979 0.05482355 0.002535587
## Cumulative Proportion  0.6523032 0.9410011 0.99582465 0.998360241
##                             Comp.5       Comp.6       Comp.7       Comp.8
## Standard deviation     0.167760033 0.0879230415 0.0559595107 3.514488e-02
## Proportion of Variance 0.001139445 0.0003129837 0.0001267839 5.000813e-05
## Cumulative Proportion  0.999499686 0.9998126701 0.9999394540 9.999895e-01
##                              Comp.9
## Standard deviation     1.613313e-02
## Proportion of Variance 1.053788e-05
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                   Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## NO_Betsafe_NA                           0.399  0.383  0.499  0.168  0.640
## YES_Betsafe_NA                         -0.342 -0.325 -0.435  0.380  0.659
## odd1_Betsafe_NA           0.585 -0.756 -0.138  0.240                     
## odd2_Betsafe_NA   -0.647 -0.573 -0.426 -0.120  0.239                     
## oddX_Betsafe_NA   -0.195        -0.317  0.619 -0.685                     
## over_Betsafe_0.5                                             0.109       
## over_Betsafe_2.5                       -0.437 -0.320  0.693  0.429 -0.184
## under_Betsafe_0.5 -0.728  0.568  0.362 -0.119                            
## under_Betsafe_2.5                       0.321  0.246 -0.270  0.794 -0.350
##                   Comp.9
## NO_Betsafe_NA           
## YES_Betsafe_NA          
## odd1_Betsafe_NA         
## odd2_Betsafe_NA         
## oddX_Betsafe_NA         
## over_Betsafe_0.5  -0.994
## over_Betsafe_2.5        
## under_Betsafe_0.5       
## under_Betsafe_2.5       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000

This is the summary of the PCA for bookmaker SBOBET. PC1 captures a large share of the total variance (82%). The away win odd is the most influential feature for Comp.1, as its loading has the largest absolute value (0.881). This bookmaker gave no over/under 0.5 odds and no both-teams-to-score odds.

## Importance of components:
##                           Comp.1    Comp.2     Comp.3      Comp.4
## Standard deviation     2.6020452 1.1791696 0.26545200 0.183366060
## Proportion of Variance 0.8191266 0.1682186 0.00852498 0.004067797
## Cumulative Proportion  0.8191266 0.9873452 0.99587019 0.999937984
##                              Comp.5
## Standard deviation     2.264083e-02
## Proportion of Variance 6.201632e-05
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## odd1_SBOBET_NA    0.454  0.830  0.275  0.168       
## odd2_SBOBET_NA   -0.881  0.366  0.245  0.173       
## oddX_SBOBET_NA   -0.130  0.404 -0.583 -0.693       
## over_SBOBET_2.5                 0.523 -0.493 -0.691
## under_SBOBET_2.5               -0.501  0.468 -0.723
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## SS loadings       1.0    1.0    1.0    1.0    1.0
## Proportion Var    0.2    0.2    0.2    0.2    0.2
## Cumulative Var    0.2    0.4    0.6    0.8    1.0
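
In all five analyses the away win odd (and, where it is offered, the under 0.5 odd) dominates PC1. `princomp` works on the covariance matrix by default, so one plausible explanation, which is not verified on the odds data here, is simply that these columns have the largest variance. The sketch below uses synthetic data (not the odds) to show the effect and how `cor = TRUE` removes it:

```r
set.seed(1)
n <- 500
# Synthetic stand-in for odds columns: one column has a much larger spread
X <- cbind(wide    = rnorm(n, sd = 5),
           narrow1 = rnorm(n, sd = 1),
           narrow2 = rnorm(n, sd = 1))

pca_cov <- princomp(X)              # covariance-based (the default)
pca_cor <- princomp(X, cor = TRUE)  # correlation-based: columns standardised

# Variable with the largest absolute loading on PC1 in the covariance case
top_cov <- names(which.max(abs(pca_cov$loadings[, 1])))

# Share of variance captured by PC1 under each choice
share_cov <- unname(pca_cov$sdev[1]^2 / sum(pca_cov$sdev^2))
share_cor <- unname(pca_cor$sdev[1]^2 / sum(pca_cor$sdev^2))

print(top_cov)
print(round(c(covariance = share_cov, correlation = share_cor), 3))
```

Comparing the column variances of `merged_matches` with the PC1 loadings would show whether this applies to the bookmaker data.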

Remarks for this part are:

  1. The PCA mappings for Tipico and SBOBET distinguish the match outcomes (home, tie, away) relatively better: they have a sharper V-shaped mapping, so their odds are useful for clustering. One reason is that their feature sets are smaller than those of the other bookmakers.

  2. The over/under 2.5 outcome is not captured well by the PCA analysis of any of the 5 bookmakers’ odds: the score plots show no separation between red and black dots. The same remark holds for MDS: with neither Euclidean nor Manhattan distances do the black and red dots show any pattern in the 2D representations.

  3. Home/Tie/Away results are clearly distinguished in the plots given below on the right-hand side: black, green and red dots are separated on the PCA mappings over PC1 and PC2. This feature set may therefore give insight into Home/Tie/Away results, in contrast to the over/under 2.5 results.

  4. A V-shape in the scores is quite common in real data. The two arms of the V separate the home wins from the away wins, and the tie results lie where the arms meet.

  5. Especially for Tipico and SBOBET, the MDS representations with Euclidean distances have a much sharper V-shape than those with Manhattan distances (the scatter of the points is noticeably reduced), which indicates that the Euclidean distance is the better choice for MDS on this dataset. For the other bookmakers, there seems to be no difference.

  6. Except for Tipico and SBOBET, where MDS and PCA both perform well, MDS does not outperform PCA. Note that MDS is strong when there are few dimensions, which supports our interpretation of the plots, because the other bookmakers have more features. Recall the “curse of dimensionality”: as the number of features grows, the distances between points become harder to interpret.

  7. There is a layer-like mapping in the plots, which is clearest for Unibet.
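
When comparing PCA with MDS as in remarks 5 and 6, it may help to recall that classical MDS (`cmdscale`) on Euclidean distances gives the same 2D configuration as the first two PCA scores, up to a sign flip of each axis. Any difference between the two methods can therefore only come from a non-Euclidean metric such as Manhattan. A sketch on synthetic data (not the odds):

```r
set.seed(42)
# Synthetic data: 200 observations, 4 features with different spreads
X <- matrix(rnorm(200 * 4), ncol = 4) %*% diag(c(4, 2, 1, 0.5))

pca_scores <- princomp(X)$scores[, 1:2]
mds_euc    <- cmdscale(dist(X, method = "euclidean"), k = 2)
mds_man    <- cmdscale(dist(X, method = "manhattan"), k = 2)

# Compare absolute values because each axis is only defined up to sign
diff_euc <- max(abs(abs(pca_scores) - abs(mds_euc)))
diff_man <- max(abs(abs(pca_scores) - abs(mds_man)))

print(diff_euc)  # numerically zero
print(diff_man)  # clearly non-zero
```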

Task 3

Part 2a-b

The image below shows a picture of me taken on a street. We first read the image into a variable, which is a 512×512×3 array. It is a stack of 3 separate 512×512 matrices, one per channel (R, G and B). Each value is the intensity of a pixel in the corresponding channel.

Now let’s display each channel separately. Because the image() function draws a matrix rotated, we reverse the rows and transpose each matrix to obtain an upright display. The original picture is obtained by ‘superposing’ the intensities of these three channels.
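
The reason for the reverse-and-transpose step is that `image()` draws `z[i, j]` at horizontal position `i` and vertical position `j`, with the vertical axis increasing upward. A small sketch with a toy matrix (the helper name `to_image_orientation` is made up for illustration):

```r
# Reverse the row order within each column, then transpose, so that image()
# shows the matrix the way it is printed (row 1 at the top).
to_image_orientation <- function(m) t(apply(m, 2, rev))

m <- matrix(1:6, nrow = 2, byrow = TRUE)  # prints as 1 2 3 / 4 5 6
r <- to_image_orientation(m)

# After the transform, the top-left pixel m[1, 1] sits at x = 1, y = max,
# and the bottom-right pixel m[2, 3] sits at x = max, y = 1.
print(r)
```

Passing `r` to `image(r, axes = FALSE, col = grey(seq(0, 1, length = 256)))` should then show the toy matrix upright.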

Part 3a-b

In this part, we add uniform random noise on [0, 0.1] to each pixel value of each channel of the original image (see the code in the appendix). We then rescale the resulting pixel values so that they stay between 0 and 1, which is needed to display the image.
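
A compact sketch of this step on a synthetic array, assuming U(0, 0.1) noise as in the appendix code. The toy array `img` stands in for the photo, and the noise is drawn for all channels in one call rather than channel by channel:

```r
set.seed(7)
# Synthetic 4 x 4 RGB "image" with intensities in [0, 1]
img <- array(runif(4 * 4 * 3), dim = c(4, 4, 3))

# Independent U(0, 0.1) noise for every pixel of every channel
noisy <- img + array(runif(length(img), min = 0, max = 0.1), dim = dim(img))

# Sums can exceed 1, so divide by the maximum to return to [0, 1]
noisy <- noisy / max(noisy)

print(range(noisy))
```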

Now let’s display each channel of the noisy image separately, using the image() function, on a single plot.

Part 4a

In this part, we first transform the noisy image to greyscale by summing the R, G and B pixel values and rescaling the result to [0, 1].

We then extract patches of size 3×3 (9 pixels each), resulting in a total of 510 × 510 = 260100 patches for this 512×512 pixel picture. Each patch corresponds to a row, and each pixel position within a patch (the top-left pixel, for example) corresponds to a column of the resulting matrix. We apply PCA to this matrix.
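
The patch matrix can also be built without an explicit double loop. The sketch below is a loop-free alternative to the appendix code, not the code used for the results; the helper name `extract_patches` is hypothetical. It keeps the same conventions: one row per patch, pixels flattened row by row, patches enumerated row by row.

```r
# Extract all k x k patches of a matrix; one row per patch
extract_patches <- function(g, k = 3) {
  n_r <- nrow(g) - k + 1   # number of patch positions vertically
  n_c <- ncol(g) - k + 1   # number of patch positions horizontally
  # Offsets inside a patch; dj (column offset) varies fastest
  offsets <- expand.grid(dj = 0:(k - 1), di = 0:(k - 1))
  cols <- lapply(seq_len(nrow(offsets)), function(o) {
    di <- offsets$di[o]
    dj <- offsets$dj[o]
    # Shifted sub-image, flattened row by row
    as.vector(t(g[(1 + di):(n_r + di), (1 + dj):(n_c + dj), drop = FALSE]))
  })
  do.call(cbind, cols)
}

set.seed(11)
g <- matrix(runif(6 * 6), 6, 6)   # small synthetic greyscale image
P <- extract_patches(g)

print(dim(P))            # (6 - 2)^2 patches, 9 pixels each
print((512 - 3 + 1)^2)   # patch count for a 512 x 512 image
```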

## Importance of components:
##                           Comp.1      Comp.2      Comp.3      Comp.4
## Standard deviation     0.4866257 0.044197112 0.036729891 0.027268299
## Proportion of Variance 0.9748770 0.008041693 0.005553912 0.003061088
## Cumulative Proportion  0.9748770 0.982918665 0.988472577 0.991533665
##                             Comp.5      Comp.6     Comp.7      Comp.8
## Standard deviation     0.022288396 0.021347850 0.02071175 0.018658540
## Proportion of Variance 0.002045113 0.001876152 0.00176601 0.001433227
## Cumulative Proportion  0.993578778 0.995454930 0.99722094 0.998654167
##                             Comp.9
## Standard deviation     0.018080720
## Proportion of Variance 0.001345833
## Cumulative Proportion  1.000000000
## 
## Loadings:
##       Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
##  [1,]  0.332  0.497  0.175  0.546         0.464                0.271
##  [2,]  0.334  0.230  0.378 -0.149  0.455 -0.447  0.202  0.324 -0.343
##  [3,]  0.333 -0.189  0.543 -0.260 -0.405         0.378 -0.405  0.121
##  [4,]  0.333  0.405 -0.176        -0.285 -0.283 -0.424 -0.474 -0.346
##  [5,]  0.335               -0.455  0.274        -0.504         0.593
##  [6,]  0.334 -0.405  0.175        -0.285  0.285 -0.424  0.474 -0.344
##  [7,]  0.333  0.190 -0.543 -0.259 -0.406         0.378  0.405  0.122
##  [8,]  0.334 -0.230 -0.379 -0.150  0.457  0.446  0.202 -0.323 -0.343
##  [9,]  0.332 -0.498 -0.174  0.546        -0.464                0.270
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000

We can see from the summary of the PCA results that the first component captures 97% of the total variance, which is quite good.
The loadings of the first component are nearly equal (about 0.333, i.e. 1/3), which means that each pixel in a patch has equal weight in the PC1 scores. Thanks to this property, we can reconstruct the picture with minimal loss in the following part.
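
A unit-length vector with nine equal entries has entries 1/sqrt(9) = 1/3, which matches the Comp.1 loadings. Projecting a patch onto such a vector is, up to a constant factor, the same as taking the patch mean, so the PC1 score behaves like a 3×3 averaging filter. A sketch on synthetic patches (the matrix `P` is a stand-in for the real patch matrix):

```r
set.seed(3)
# Synthetic patch matrix: 1000 patches of 9 pixels
P <- matrix(runif(1000 * 9), ncol = 9)

# Unit-length direction with equal weights: each entry is 1/sqrt(9) = 1/3
w <- rep(1 / sqrt(9), 9)

# Score along w, computed as princomp does: centre the columns, then project
score <- as.vector(scale(P, center = TRUE, scale = FALSE) %*% w)

# The same quantity written as 3 times the centred patch mean
patch_mean <- rowMeans(P)
score_from_mean <- 3 * (patch_mean - mean(patch_mean))

print(max(abs(score - score_from_mean)))  # numerically zero
```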

Part 4b

As expected, the first component captures the most variance. The scores on the first component form just a slightly blurred version of the original picture, whereas the scores on the second and third components mainly show edges. The components are orthogonal to each other by construction in PCA. The second and third components give a ‘relief-like’ mapping in which the scores have opposite signs. We can see this in the contrast between the second and third pictures below: areas that are black in the second picture (the right side of the arm, for example) appear white in the third.

Note that we again scaled the mappings to [0,1] in order to be able to plot the images.
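
Turning a score vector back into an image needs two steps: min-max rescaling to [0, 1] and a reshape that respects the order in which the patches were enumerated (row by row). A toy sketch with made-up scores; `rescale01` is an illustrative helper name:

```r
# Min-max rescaling to [0, 1], as needed by rasterImage()
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# Toy example: a 2 x 3 grid of patches, enumerated row by row
# (i in the outer loop, j in the inner loop)
n_rows <- 2
n_cols <- 3
scores <- c(-1.0, 0.5, 2.0,   # patches (1,1), (1,2), (1,3)
             0.0, 1.0, 3.0)   # patches (2,1), (2,2), (2,3)

# matrix() fills column by column, so fill an n_cols x n_rows matrix
# and transpose it to recover the row-by-row layout
img <- t(matrix(rescale01(scores), n_cols, n_rows))

print(img)
```

For the 510 × 510 grid of patches both dimensions are equal, which is why the appendix code can use `t(matrix(z, 510, 510))`.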

Part 4c

The loading vector (eigenvector) of each component shows the ‘importance’ given to each pixel position in a patch. The image of the first component’s loadings is nearly symmetric. In fact, all 9 pixels would have almost the same colour, because the loadings are nearly equal; the min-max scaling used for display exaggerates the small differences. PC1 weights the pixels of a patch equally.

We see that in PC2 the top-left (white) and bottom-right (black) pixels carry the most weight. In PC3 the top-right and bottom-left pixels are the emphasized ones. This is also visible in the second and third pictures above: the lines in the Comp2 image are sharper in the top-left to bottom-right direction, whereas the opposite is true for Comp3 (the top of the right shoulder and the diagonal window of the white car behind are picked up in the Comp3 image above).
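
The corner pattern can be checked directly from the printed loadings. The sketch below copies the Comp.2 and Comp.3 loadings from the summary and reshapes them row by row, since the patches were flattened row by row. Entries left blank in the printout (absolute value below 0.1) are set to 0, which is an approximation.

```r
# Loadings of Comp.2 and Comp.3 copied from the PCA summary;
# the blank centre entry is approximated by 0
comp2 <- c(0.497, 0.230, -0.189, 0.405, 0, -0.405, 0.190, -0.230, -0.498)
comp3 <- c(0.175, 0.378, 0.543, -0.176, 0, 0.175, -0.543, -0.379, -0.174)

# Rebuild the 3 x 3 patch layout (row-major)
L2 <- matrix(comp2, 3, 3, byrow = TRUE)
L3 <- matrix(comp3, 3, 3, byrow = TRUE)

# Rotate a matrix by 180 degrees
rot180 <- function(m) m[nrow(m):1, ncol(m):1]

print(L2)
print(L3)

# An edge (gradient) filter changes sign under a 180-degree rotation,
# so these two maxima should be close to zero
print(max(abs(L2 + rot180(L2))))
print(max(abs(L3 + rot180(L3))))
```

Both filters are close to antisymmetric, so each responds to an intensity change across the patch rather than to its overall brightness: Comp.2 contrasts the top-left corner with the bottom-right one, and Comp.3 contrasts the top-right corner with the bottom-left one.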

Appendix

Task 1-2

require(data.table)
require(MASS)
require(jpeg)
matches_file_path="C:/Users/diaykanat/Desktop/IE 582/HW1/matches.rds"
odd_details_file_path="C:/Users/diaykanat/Desktop/IE 582/HW1/odd_details.rds"

#read the data
matches=readRDS(matches_file_path)
odds=readRDS(odd_details_file_path)

#prepare the total scores and over/under
matches=unique(matches)
matches[,c("HomeGoals","AwayGoals"):=tstrsplit(score,':')]
matches[,Year:=year(match_time)]
matches$HomeGoals=as.numeric(matches$HomeGoals)
matches[,AwayGoals:=as.numeric(AwayGoals)]
matches[,TotalGoals:=HomeGoals+AwayGoals]
matches[,IsOver:=0]
matches[TotalGoals>2.5,IsOver:=1]
matches[HomeGoals>AwayGoals,Result:=0]  #HOME
matches[HomeGoals<AwayGoals,Result:=1]  #AWAY
matches[HomeGoals==AwayGoals,Result:=2]  #TIE
matches=matches[complete.cases(matches)]

#take the final odds
odds_f=odds[,list(final_odd=odd[.N]),by=list(matchId,oddtype,bookmaker,totalhandicap)]

#try this for 5 bookmakers separately
odds_f=odds_f[bookmaker %in% c('Unibet') #'Tipico','ComeOn','Betsafe','Unibet','SBOBET'
             & (is.na(totalhandicap) | totalhandicap == 2.5 | totalhandicap == 0.5)
             & (oddtype %in% c("odd1","odd2","oddX","YES","NO","over","under"))]

#merge with the real match results
odds_f_wide=dcast(odds_f,matchId~oddtype + bookmaker + totalhandicap,value.var='final_odd')
odds_f_wide=odds_f_wide[complete.cases(odds_f_wide)]
merged_matches=merge(matches,odds_f_wide,by='matchId')

#take the result vectors
results<-merged_matches$IsOver
results2<-merged_matches$Result
merged_matches<-merged_matches[,-c(1:12)]

#apply PCA and display the results
pca=princomp(merged_matches)
str(pca)
summary(pca)
plot(pca$scores[,1],pca$scores[,2],col=results+1,pch=".",cex=7) 
plot(pca$scores[,1],pca$scores[,2],col=results2+1,pch=".",cex=7) 

#compute the distance matrices 
e = dist(merged_matches, method = "euclidean")
m = dist(merged_matches, method = "manhattan")

#apply mds
emds=cmdscale(e)
mmds=cmdscale(m)

#plot the scores and show in different colors according to the result vectors, separately for over/under 2.5 and home/tie/away cases
#put the legends separately for each bookmaker
#20 plots in total
par(mfrow = c(10,2))
plot(pca$scores[,1],pca$scores[,2],main='2D Scores of PCA\n for Unibet',cex.main = 1,col=results+1,pch=".",cex=7,xlab="Comp1 Mapped Score",ylab="Comp2 Mapped Score") 
legend("topleft", legend=c("Under 2.5","Over 2.5"),col=c(1,2), pch = ".",cex=1,text.width = 5,pt.cex = 7)
plot(pca$scores[,1],pca$scores[,2],main='2D Scores of PCA\n for Unibet',cex.main = 1,col=results2+1,pch=".",cex=7,xlab="Comp1 Mapped Score",ylab="Comp2 Mapped Score") 
legend("topleft", legend=c("Home","Away","Tie"),col=c(1,2,3), pch = ".",cex=1,text.width = 5,pt.cex = 7)

plot(emds[,1],emds[,2],main='2D Representation of MDS Results using Euclidean Distances\n for Unibet',cex.main = 1,xlab='', ylab='',col=results+1,pch=".",cex=7)
legend("bottomleft", legend=c("Under 2.5","Over 2.5"),col=c(1,2), pch = ".",cex=1,text.width = 5,pt.cex = 7)

plot(mmds[,1],mmds[,2],main='2D Representation of MDS Results using Manhattan Distances\n for Unibet',xlab='',cex.main = 1, ylab='',col=results+1,pch=".",cex=7)
legend("bottomleft", legend=c("Under 2.5","Over 2.5"),col=c(1,2), pch = ".",cex=1,text.width = 10,pt.cex = 7)

Task 3

#Read the image
resim <- readJPEG("HW2_Resim.jpg")

#display the image
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(resim,0,0,1,1)
dim(resim)

par(mfrow=c(1,3))
resimr1 <- apply(resim[,,1], 2, rev)
resimr2 <- apply(resim[,,2], 2, rev)
resimr3 <- apply(resim[,,3], 2, rev)

#display each channel separately, take the reverse and the transpose in order to get the original picture with the image function
image(t(resimr1),ann=FALSE,axes=FALSE,col = grey(seq(0, 1, length = 256)))
image(t(resimr2),ann=FALSE,axes=FALSE,col = grey(seq(0, 1, length = 256)))
image(t(resimr3),ann=FALSE,axes=FALSE,col = grey(seq(0, 1, length = 256)))

m<-resim

#add uniform noise between [0,0.1] to every channel separately
m[,,1]<-resim[,,1] + matrix(runif(512*512,0,0.1),512,512)
m[,,2]<-resim[,,2] + matrix(runif(512*512,0,0.1),512,512)
m[,,3]<-resim[,,3] + matrix(runif(512*512,0,0.1),512,512)

#scale the values in order to display
m<-m/max(m)
#display the image with added noise
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(m,0,0,1,1)

#display each channel of the noisy image
#display each channel separately, take the reverse and the transpose in order to get the original picture with the image function
par(mfrow=c(1,3))
mr1 <- apply(m[,,1], 2, rev)
mr2 <- apply(m[,,2], 2, rev)
mr3 <- apply(m[,,3], 2, rev)

image(t(mr1),ann=FALSE,axes=FALSE,col = grey(seq(0, 1, length = 256)))
image(t(mr2),ann=FALSE,axes=FALSE,col = grey(seq(0, 1, length = 256)))
image(t(mr3),ann=FALSE,axes=FALSE,col = grey(seq(0, 1, length = 256)))

#sum the three channels to convert to grayscale (rescaled to [0,1] below, so this is equivalent to averaging)
g<-m[,,1]+m[,,2]+m[,,3]
#normalize the values
g<-g/max(g)
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(g,0,0,1,1)

#determine the matrix of 3x3 patches (one row per patch, pixels flattened row by row)
count = 0
liste <-list(NULL)
for(i in c(1:510)){for(j in c(1:510)){
  vect<-c(((t(g[c(i:(i+2)),c(j:(j+2))]))))
  count = count + 1
  liste[[count]] = vect
}}

liste <-t(as.data.table(liste[]))

#reset the row names of the patch matrix
rownames(liste) <-NULL
#apply PCA to the 260100 x 9 patch matrix
pca<-princomp(liste)

#PC1
z<-pca$scores[,1]
z<-((z-min(z))/(max(z)-min(z)))
z<-t(matrix(z,510,510))

par(mfrow=c(1,3))
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(z,0,0,1,1)

#PC2
y<-pca$scores[,2]
y<-((y-min(y))/(max(y)-min(y)))
y<-t(matrix(y,510,510))

plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(y,0,0,1,1)

#PC3
x<-pca$scores[,3]
x<-((x-min(x))/(max(x)-min(x)))
x<-t(matrix(x,510,510))

plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(x,0,0,1,1)

#part c
#display the eigenvectors(loadings) matrix as an image
#patches were flattened row by row, so fill the 3x3 matrix with byrow=TRUE
#do the scaling for every component
l1<-matrix(pca$loadings[,1],3,3,byrow=TRUE)
l1<-((l1-min(l1))/(max(l1)-min(l1)))
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(l1,0,0,1,1)

l2<-matrix(pca$loadings[,2],3,3,byrow=TRUE)
l2<-((l2-min(l2))/(max(l2)-min(l2)))
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(l2,0,0,1,1)

l3<-matrix(pca$loadings[,3],3,3,byrow=TRUE)
l3<-((l3-min(l3))/(max(l3)-min(l3)))
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(l3,0,0,1,1)